Abstract. We revisit various PTASs (Polynomial Time Approximation
Schemes) for minimization versions of dense problems, and show that
they can be performed with sublinear query complexity. This means that
not only do we obtain a $(1+\varepsilon)$-approximation to the NP-Hard
problems in polynomial time, but we also avoid reading the entire input.
This setting is particularly advantageous when the price of reading parts
of the input is high, as is the case, for example, when humans provide the
input. Trading off query complexity with approximation is the raison
d'être of the field of learning theory, and of the ERM (Empirical Risk
Minimization) setting in particular. A typical ERM result, however, does
not deal with computational complexity. We discuss two particular problems
for which (a) it has already been shown that sublinear querying is
sufficient for obtaining a $(1+\varepsilon)$-approximation using unlimited
computational power (an ERM result), and (b) with full access to the input,
we could get a $(1+\varepsilon)$-approximation in polynomial time (a PTAS).
Here we show that neither benefit need be sacrificed: we get a PTAS with
efficient query complexity.

The first problem is known as Minimum Feedback Arc-Set in Tournaments
(MFAST). A PTAS has been discovered by Schudy and Mathieu, and
an ERM result by Ailon. The second is $k$-Correlation Clustering ($k$-CC).
A PTAS has been discovered by Giotis and Guruswami, and an ERM
result by Ailon and Begleiter.

Two techniques are developed. The first solves the problem for the
low-cost case of $k$-CC (the analogous case is already known for MFAST).
This requires a careful sampling scheme together with a proof of a
structural property relating the costs of vertices against the optimal
sample clustering to their costs against the full optimal clustering. The
second addresses the high-cost case, by showing that a classic method by
Arora et al. (2002) for obtaining additive approximations can be made query
efficient. The underlying technique is "double sampling": one sample
is amenable to exhaustive solution enumeration, but well approximates
only polynomially many solutions (including the optimal), and another
sample cannot be used to exhaustively search solutions, but well
approximates the cost of the entire solution space, and is used for
verification.



1 Introduction 



We study two NP-Hard combinatorial minimization problems for which it is
known how to get a $(1+\varepsilon)$-approximate solution under two
scenarios. In the first scenario, the algorithm has full access to the
input, and is required to run in polynomial time. In the second scenario,
the algorithm has exponential computational power but is allowed to uncover
only a sublinear amount of input. In this work we show that no requirement
needs to be sacrificed. In other words, we satisfy the following three
requirements simultaneously:

(R1) A polynomial time algorithm.

(R2) A $(1+\varepsilon)$-approximate solution.

(R3) A sublinear (in input size) query complexity. 

The first problem is known as $k$-Correlation Clustering ($k$-CC). Given an
undirected graph $G = (V, E)$, the objective is to find a decomposition of
$V$ into $k$ (possibly empty) disjoint subsets (clusters) $C_1, \dots, C_k$
so that the symmetric difference between $E$ and the set
$\{(u, v) : \exists i \text{ s.t. } \{u, v\} \subseteq C_i\}$ is minimized.
The second problem is the Minimum Feedback Arc-set in Tournaments (MFAST).
In this problem, given a tournament $G = (V, A)$, the objective is to write
its vertices in a sequence from left to right so that the number of edges
pointing to the left (backward edges) is minimized. Requirements (R1) and
(R2) are achieved by Giotis et al. in [9] for $k$-CC and by Kenyon-Mathieu
et al. in [10] for MFAST. Requirements (R2) and (R3) were achieved very
recently by Ailon et al. for $k$-CC and by Ailon in [1] for MFAST. In this
work we obtain (R1)+(R2)+(R3) for both problems. Our result uses components
from the aforementioned citations, together with new ideas required for
obtaining our strong guarantees.

1.1 Previous Work and Our Contribution 

In the world of combinatorial approximations, Correlation Clustering (CC)
(also known as cluster editing) has been defined by Blum et al. In the
original version there was no bound on the number $k$ of clusters.
Correlation clustering is max-SNP-Hard [8] but admits constant factor
polynomial time approximations (e.g. [8, 3]). Maximization versions have
also been considered [11]. In this work we concentrate on the minimization
problem only, which is more difficult for the purpose of obtaining a PTAS.
The $k$-correlation clustering problem ($k$-CC), in which the number of
output clusters is bounded by $k$, is also NP-Hard but admits a PTAS [9]
running in time $n^{O(9^k/\varepsilon^2)} \log n$. There is a natural
machine learning theoretical interpretation of CC: the instance space is
identified with the space of element pairs, and each edge (resp. non-edge)
in $G$ is a label stipulating equivalence (resp. non-equivalence) of the
corresponding pair. The CC cost is the risk, defined as the number of pairs
of elements on which the solution disagrees with the labels. Roughly
speaking, an algorithm attempting to minimize the risk by, instead,
minimizing an estimator thereof obtained by sampling labels, is an
Empirical Risk Minimization (ERM) algorithm. An ERM algorithm need not
be constrained by computational restrictions, and should be thought of as
an information theoretical, not computational, result. It should be noted
that machine learning clustering theoreticians and practitioners have been
studying how



^ By tournament we mean that for all distinct $u, v \in V$, either $(u, v) \in A$ or $(v, u) \in A$.



to use correlation clustering type labels in conjunction with more
traditional geometric clustering approaches (e.g. $k$-means; see Basu's
thesis [7] and references therein). Such labels are expensive because they
require solicitation from humans. Minimizing query complexity is hence
important.

From a combinatorial optimization point of view, MFAST is NP-Hard [4]
but admits a PTAS [10] (see references therein for a more elaborate history
of this important problem). The problem also has a machine learning
theoretical interpretation, if we think of the directionality of the edge
connecting $u$ and $v$ in the tournament $T$ as a label. An ERM result has
been obtained by Ailon [1] very recently. Interestingly, although Ailon's
algorithm is not computationally efficient, it relies quite heavily on the
ideas used in the PTAS [10].

In this work we obtain requirements (R1), (R2) and (R3) simultaneously, for
both $k$-CC and MFAST.

2 Notations 

For a natural number $n$ we denote by $[n]$ the set of integers
$\{1, \dots, n\}$. Let $V$ denote a ground set of $n$ elements. In the
$k$-CC problem, $V$ is endowed with an undirected graph $G = (V, E)$. A
solution to the problem is given as a clustering
$\mathcal{C} = \{C_1, \dots, C_k\}$ of $V$ into $k$ disjoint parts. We
define $\equiv_{\mathcal{C}}$ to be the equivalence relation in which
$C_1, \dots, C_k$ are the equivalence classes. Equivalently, we view a
solution as an undirected graph $G(\mathcal{C}) = (V, E(\mathcal{C}))$ in
which $(u, v) \in E(\mathcal{C})$ if and only if $u \equiv_{\mathcal{C}} v$.
The cost $\mathrm{cost}_G(\mathcal{C})$ of a solution $\mathcal{C}$ is the
cardinality of the symmetric difference between the sets $E$ and
$E(\mathcal{C})$. When the input is clear from the context, we will simply
write $\mathrm{cost}(\mathcal{C})$.
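As a concrete illustration of the $k$-CC cost just defined, the following minimal Python sketch counts the pairs on which a clustering disagrees with the edge labels. The function name `kcc_cost` and the representation of the input (an edge list and a list of vertex sets) are assumptions made for this example only, and the sketch reads every pair, so it is not query efficient.

```python
from itertools import combinations

def kcc_cost(vertices, edges, clusters):
    """Size of the symmetric difference between E and E(C)."""
    # Map each vertex to the index of the cluster containing it.
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    edge_set = {frozenset(e) for e in edges}
    cost = 0
    for u, v in combinations(vertices, 2):
        same_cluster = label[u] == label[v]
        has_edge = frozenset((u, v)) in edge_set
        # A pair is charged if it is an edge across clusters
        # or a non-edge inside a cluster.
        if same_cluster != has_edge:
            cost += 1
    return cost

# A path 0-1-2-3 split into {0, 1} and {2, 3}: only the edge (1, 2) is cut.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(kcc_cost(range(4), path_edges, [{0, 1}, {2, 3}]))  # prints 1
```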

In the MFAST problem, $V$ is endowed with a tournament graph $T = (V, A)$.
A solution is an injective function $\pi : V \to [n]$ (a permutation). We
define $\prec_\pi$ to denote the induced order relation, namely:
$u \prec_\pi v$ if and only if $\pi(u) < \pi(v)$. Equivalently, a solution
can be viewed as a tournament $T(\pi) = (V, A(\pi))$, where
$(u, v) \in A(\pi)$ if and only if $u \prec_\pi v$. The loss
$\mathrm{cost}_T(\pi)$ of a solution is the number of edges
$(u, v) \in A(\pi)$ such that $(v, u) \in A$. In words, a unit cost is
incurred for each inverted edge. When the input $T$ is clear from the
context, we may simply write $\mathrm{cost}(\pi)$.
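The MFAST loss can be illustrated in the same spirit. The sketch below assumes the tournament is given as a collection of ordered pairs and the permutation as a dictionary from vertices to positions; the name `mfast_cost` is hypothetical.

```python
def mfast_cost(arcs, pi):
    """Number of arcs of the tournament that point backward under pi.

    An arc (a, b) is backward when b is placed before a, i.e. pi[b] < pi[a].
    """
    return sum(1 for (a, b) in arcs if pi[b] < pi[a])

# A directed 3-cycle always has at least one backward arc.
cycle = [(0, 1), (1, 2), (2, 0)]
print(mfast_cost(cycle, {0: 1, 1: 2, 2: 3}))  # prints 1
```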

3 Statement of Results and Method Overview 

As in earlier work, our query efficient PTAS for both $k$-CC and MFAST
distinguishes between a high cost case and a low cost case. In MFAST, high
cost means that the optimal solution has cost at least $P(\varepsilon)n^2$,
where $P(\varepsilon) = O(\varepsilon^2)$. In $k$-CC,
high cost means that the optimal solution has cost at least Q{e,k)n^, where 
Q{e,k) =0{e^/k^^). 

In the low cost case, the problem has been solved for MFAST by Ailon [1].
There it is shown that 0{ne~'^ log"* n) edges from T are sufficient for finding a 
$(1+\varepsilon)$-approximate solution, in $\mathrm{poly}(n, \varepsilon^{-1})$
time. We refer the reader to [1] for the details. As for the low cost case
for $k$-CC, we show a PTAS with $o(n^2)$ query complexity in Section 4. The
main idea of the algorithm is similar to that in



* A tournament means that exactly one of $(u, v)$ or $(v, u)$ is in $A$ for all $u \neq v$.



[9], but differs in a significant way. Roughly speaking, both algorithms
choose a sample of vertices and enumerate over $k$-clusterings of the
sample, while trying to compute optimal big clusters from the sample. In
[9], for each such choice of sample $k$-clustering, a clustering of $V$ is
chosen, and recursion is executed on the union of small clusters. Here, we
use the sample in vitro to learn a strong structural property of any
optimal solution for the entire input. In particular, we don't need to
return from a recursion to perform this learning.

For the high cost case we invoke an algorithm giving an additive
$\varepsilon P(\varepsilon)n^2$ (resp. $\varepsilon Q(\varepsilon,k)n^2$)
approximation for MFAST (resp. for $k$-CC). To that end, we use a standard
LP based technique together with another double sampling trick necessary
for query efficiency, which we describe in Section 5 for the MFAST case
(the $k$-CC case is easier). The main result there is as follows:
Theorem 1. There exists a polynomial (in n) time algorithm for obtaining an 
additive eP{e)n'^ (resp. £Q{e,k)n'^)) approximation for MFAST (resp. for k- 
CC). The algorithm queries 0{e~'^P~^{e)n log n) (resp. 0(e~^Q~^(e, fc)n^) j in- 
put edges and runs in time „o(e-^p-^(e) iogP(£)) ^^^^^^ ^o(e-^Q-^{6,k)iogk) ^ 

In order to know whether we are at all in the high cost case, we apply the
additive approximation algorithm in any case, and approximate the cost of
the returned solution to within an additive error of
$O(P(\varepsilon)n^2)$ (resp. $O(Q(\varepsilon, k)n^2)$).
This estimation can clearly be done, with success probability at least 1 — n^^", by 
sampling at most 0{P~^{e) logn) (resp. 0{Q~'^{e, k) logn) ) edges, by standard 
measure concentration arguments. This bound is overwhelmed by the bounds
of Theorem 1. Our main results are summarized as follows.
Theorem 2. There exists a PTAS for k-CC running in time n'-'^^ ^ ^°^^'> 
and requiring at most 0{e~^'^k^^n\ogn) edge queries. With probability at least 
1 — n~^, it outputs a clustering C with cost(C) < cost(C*)(l +£), where C* is an 
optimal solution. 

Theorem 3. There exists a PTAS for MFAST running in time n'-'^^ and 
requiring at most 0(£~^n log n + e~^nlog^ n) edge queries. With probability at 
least 1 — n"'^ , it outputs a permutation a with cosiijj) < cost(7r*)(l +e), where 
TT* is an optimal solution. 

Note that the running times are overwhelmed by the high cost case in both,
and the query complexity is overwhelmed by the high cost case in Theorem 2.
We also note that we did not make a real effort to optimize the constants,
including the exponents of $k, \varepsilon$.

4 Query Efficient PTAS for Low Cost in $k$-CC

We study the low cost case of $k$-CC on input $G = (V, E)$, and analyze an
algorithm satisfying (R1) + (R2) + (R3). We need two ingredients. In Section 4.1

^ Note that the algorithm in [1] relies on a divide and conquer recursive strategy,
in which the high cost algorithm and test must be implemented at each recursion
node. This also holds for our $k$-CC algorithm, which identifies large clusters and
then recurses on small ones. The recursive calls must solve and test for the high cost
case as well.



we approximate the contribution of a single node $v$ to the cost of any
solution identical to the optimal solution except (maybe) for a change in
the cluster to which $v$ belongs. In Section 4.2 we achieve the PTAS, using
a strategy similar to that of Giotis et al. in [9]: identification of the
large clusters in the optimal solution and recursion on the remainder. Note
that the algorithm of [9] does not satisfy (R3), hence ours makes better
use of the queried information.

4.1 An additive approximation of vertex costs 

A major component in our PTAS for $k$-clustering is an additive
approximation for the contribution of each vertex to the cost of the
clustering. We start by formally defining this contribution, and then
present Algorithm 1 and its analysis.

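Definition 1. Let $\mathcal{C}^* = \{C_1^*, \dots, C_k^*\}$ be an optimal
$k$-clustering, and assume its cost is $\gamma n^2$ for some $\gamma > 0$.
For $v \in V$ let $j^*(v)$ be defined as the unique index such that
$v \in C^*_{j^*(v)}$. Let $1_{u+v}$ be the indicator variable for the
predicate $(u, v) \in E$, and similarly define the complement
$1_{u-v} = 1 - 1_{u+v}$. Let
$\deg_+(v,j) = \sum_{u \in C_j^* \setminus \{v\}} 1_{u+v}$,
$\deg_-(v,j) = \sum_{u \in C_j^* \setminus \{v\}} 1_{u-v}$,
$\mathrm{degout}_+(v,j) = \sum_{u \in (V \setminus C_j^*) \setminus \{v\}} 1_{u+v}$, and
$\mathrm{degout}_-(v,j) = \sum_{u \in (V \setminus C_j^*) \setminus \{v\}} 1_{u-v}$.
Let $\mathrm{cost}^*(v) = \sum_{u \in C^*_{j^*(v)} \setminus \{v\}} 1_{u-v} + \sum_{u \notin C^*_{j^*(v)}} 1_{u+v}$.
Notice that $\mathrm{cost}^*(v) = \deg_-(v, j^*(v)) + \mathrm{degout}_+(v, j^*(v))$
and $\mathrm{cost}(\mathcal{C}^*) = \frac{1}{2} \sum_{v \in V} \mathrm{cost}^*(v)$.
For any $j \in [k]$, let
$\mathrm{cost}^*(v, j) = \deg_-(v,j) + \mathrm{degout}_+(v,j)$. That is,
$\mathrm{cost}^*(v,j)$ is the contribution of the vertex $v$ to the cost of
the clustering that is identical to $\mathcal{C}^*$, except for the
location of $v$, which is reset to $C_j^*$.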



Algorithm 1 Additive approximation of cost* 

Input: A graph $G = (V, E)$, a parameter $\beta > 0$ and an integer $k > 1$

Output: $\forall v \in V$ and $j \in [k]$, an estimation $\widehat{\mathrm{cost}}(v, j)$ of $\mathrm{cost}^*(v, j)$

Choose S = (wi , . . . , ft), where t — clog(n)/3~^ (c is some sufficiently large 
universal constant), be a multiset of i.i.d. uniformly randomly chosen vertices from V. 
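Let $\hat{S}_1, \dots, \hat{S}_k$ be an optimal $k$-clustering for the reduced problem $(S, E|_S)$ (where
$E|_S = E \cap (S \times S)$), where the solution is found using exhaustive search.

For any $v \in V$, $j \in [k]$, let: $\widehat{\deg}_-(v,j) = \sum_{u \in \hat{S}_j \setminus \{v\}} 1_{u-v}$ and
$\widehat{\mathrm{degout}}_+(v,j) = \sum_{u \in (S \setminus \hat{S}_j) \setminus \{v\}} 1_{u+v}$. (The summations count elements of $S$ with multiplicities.)
Output for every $v \in V$, $j \in [k]$ the estimation:

$\widehat{\mathrm{cost}}(v, j) = \frac{n}{|S|} \left( \widehat{\deg}_-(v,j) + \widehat{\mathrm{degout}}_+(v,j) \right)$.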
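The steps of Algorithm 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the graph is accessed through a hypothetical symmetric oracle `has_edge(u, v)`, the sample size `t` is passed in directly rather than derived from $\beta$, repeated copies of a vertex in the sample are labeled independently during the exhaustive search, and the counts are rescaled by $n/|S|$ so that the estimates live on the same scale as $\mathrm{cost}^*(v, j)$. The exhaustive search visits all $k^t$ labelings, so the sketch is only usable for tiny samples.

```python
import itertools
import random

def estimate_vertex_costs(n, has_edge, k, t, rng):
    """Return est[v][j], an estimate of the cost of placing v in cluster j."""
    # Multiset of t i.i.d. uniformly random vertices.
    sample = [rng.randrange(n) for _ in range(t)]

    def sample_cost(labels):
        # Disagreements of a labeling on pairs of sample positions;
        # pairs made of two copies of the same vertex are skipped.
        cost = 0
        for a, b in itertools.combinations(range(t), 2):
            u, v = sample[a], sample[b]
            if u != v and (labels[a] == labels[b]) != has_edge(u, v):
                cost += 1
        return cost

    # Exhaustive search over all k^t labelings of the sample.
    best = min(itertools.product(range(k), repeat=t), key=sample_cost)

    est = [[0.0] * k for _ in range(n)]
    for v in range(n):
        for j in range(k):
            disagreements = 0
            for pos, u in enumerate(sample):
                if u == v:
                    continue
                inside = best[pos] == j
                # Charged: a non-edge to a sample vertex inside cluster j,
                # or an edge to a sample vertex outside cluster j.
                if inside != has_edge(u, v):
                    disagreements += 1
            est[v][j] = n * disagreements / t
    return est
```

Only pairs with at least one endpoint in the sample are ever passed to `has_edge`, which is where the saving in edge queries comes from.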



The rest of this section proves the following guarantee of Algorithm 1.

Theorem 4. Fix $\beta > 0$, to be passed as a parameter to Algorithm 1. There exists
some universal constant c such that ifj< then for all v gV, j G [k] it holds 



that for the output of the algorithm, after possibly renaming the optimal clusters
$\{C_1^*, \dots, C_k^*\}$, $|\widehat{\mathrm{cost}}(v, j) - \mathrm{cost}^*(v, j)| \le \beta n$. For any input, Algorithm 1 will
run in n^^^ ^°sk) ^jy^jg n^^d will require at most 0(nlog(n)/3~^) edge queries. 

The claim regarding the time and query complexity of the algorithm is
trivial. Indeed, the time is dominated by exhaustively searching the space
of $k$-clusterings of the sample $S$ in the algorithm. We focus on proving
the correctness. We need some more definitions.



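Definition 2. Let $u, v \in V$, $S$ a multi-subset of $V$, $j \in [k]$ and $\delta > 0$. Let
$\deg_+^S(v,j) = \sum_{u \in (C_j^* \cap S) \setminus \{v\}} 1_{u+v}$,
$\deg_-^S(v,j) = \sum_{u \in (C_j^* \cap S) \setminus \{v\}} 1_{u-v}$,
$\mathrm{degout}_+^S(v,j) = \sum_{u \in (S \setminus C_j^*) \setminus \{v\}} 1_{u+v}$ and
$\mathrm{degout}_-^S(v,j) = \sum_{u \in (S \setminus C_j^*) \setminus \{v\}} 1_{u-v}$,
where the summations take multiplicities in $S$ into account. Let
$\mathrm{cost}^{*S}(v) = \deg_-^S(v, j^*(v)) + \mathrm{degout}_+^S(v, j^*(v))$.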

In what follows, set 5 — 0{(3^). Define S as the partition of S (from Algo- 
rithm H]) induced by C*. That is S = {Si, ...,Sk} where Sj = C* n S. 

Lemma 1. With probability at least 1 — n^^*^, for all v € V and j € [k], 

$\max\{\, |\deg_+(v,j)/n - \deg_+^S(v,j)/|S||,\ |\deg_-(v,j)/n - \deg_-^S(v,j)/|S||,$

$|\mathrm{degout}_+(v,j)/n - \mathrm{degout}_+^S(v,j)/|S||,\ |\mathrm{degout}_-(v,j)/n - \mathrm{degout}_-^S(v,j)/|S|| \,\} = O(\delta)$.

The simple proof is deferred to Appendix A. From Lemma 1:

Lemma 2. Assume j = o{S). With probability l~n^^^, the cost of the partition 
S on the graph G\s ~ {S,E\g) is at most 0(b\S^\ 

In the following lemma we show that any pair of clusterings that are close
w.r.t. their edges are also close w.r.t. their vertices.

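Lemma 3. Let $\mathcal{S}, \hat{\mathcal{S}}$ be two $k$-clusterings of $S$, and let $E(\mathcal{S}), E(\hat{\mathcal{S}})$ be their
corresponding edge sets, namely, $(u, v) \in E(\mathcal{S})$ if and only if $u \equiv_{\mathcal{S}} v$, and similarly
for $\hat{\mathcal{S}}$. Assume the size of the symmetric difference between $E(\mathcal{S})$ and $E(\hat{\mathcal{S}})$ is at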
most ^l^l^, where 5 < c/k^ and c is a sufficiently small constant. Then for some 
reordering of indices, for every $j \in [k]$, $\max\{|\hat{S}_j \setminus S_j|, |S_j \setminus \hat{S}_j|\} = O(\delta^{1/3}|S|)$.

We will only present a main structural claim used by the proof. The remainder
of the proof will be deferred to Appendix C.

Proof. We start with an auxiliary claim showing that every cluster in $\mathcal{S}$ has a
similar cluster in $\hat{\mathcal{S}}$ (and vice versa).



Claim. Let $C$ be a cluster of $\mathcal{S}$. There exists some cluster $D$ in $\hat{\mathcal{S}}$ such that
$|C \setminus D| \le O(\delta^{1/3}|S|)$.



Proof. Let $D$ be a cluster in $\hat{\mathcal{S}}$ that maximizes $|D \cap C|$. Let $A = D \cap C$ and let
$\bar{A} = C \setminus A$. Notice that for every pair $(u, v) \in A \times \bar{A}$ the edge $(u, v)$ is an element of
$E(\mathcal{S}) \setminus E(\hat{\mathcal{S}})$. Hence, $|A||\bar{A}| \le \delta|S|^2$. If $|C| \le \delta^{1/3}|S|$ then the claim holds trivially.
If $|C| > \delta^{1/3}|S|$ then we get: $|A|(|C| - |A|) = |A||\bar{A}| \le \delta|S|^2 < \delta^{1/3}|C|^2$. A simple
calculation will show that either $|A| \le O(\delta^{1/3}|C|)$ or $|A| \ge |C|(1 - O(\delta^{1/3}))$. By
setting the constant $c$ to be sufficiently small we get that the first option implies
$|A| < |C|/k$, which is impossible due to the fact that $D$ maximizes $|D \cap C|$ over all
clusters $D$ in $\hat{\mathcal{S}}$. We conclude that $|C \setminus D| \le O(\delta^{1/3}|S|)$, proving
the claim. The remainder of the proof of Lemma 3 is deferred to Appendix C.

Let $\hat{\mathcal{S}} = \{\hat{S}_1, \dots, \hat{S}_k\}$ be an optimal $k$-clustering of the induced input $G|_S$. By
Lemma[21 we know that with probability at least 1 — n~^° the cost of the solution 
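$\mathcal{S}$ is at most $O(\delta|S|^2)$. By the triangle inequality, this implies that the symmetric
difference between $E(\mathcal{S})$ and $E(\hat{\mathcal{S}})$ is at most $O(\delta|S|^2)$. Hence, we may apply
Lemma 3 and assume that the clusters $S_1, \dots, S_k$ and $\hat{S}_1, \dots, \hat{S}_k$ are aligned with
each other. Define: $\widehat{\deg}_+(v,j) = \sum_{u \in \hat{S}_j \setminus \{v\}} 1_{u+v}$, $\widehat{\deg}_-(v,j) = \sum_{u \in \hat{S}_j \setminus \{v\}} 1_{u-v}$,

$\widehat{\mathrm{degout}}_+(v,j) = \sum_{u \in (S \setminus \hat{S}_j) \setminus \{v\}} 1_{u+v}$, and $\widehat{\mathrm{degout}}_-(v,j) = \sum_{u \in (S \setminus \hat{S}_j) \setminus \{v\}} 1_{u-v}$.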
Lemma 4. With probability at least 1 — n^^, for all v G V and j £ [k], 



$\left| \frac{\deg_+(v,j)}{n} - \frac{\widehat{\deg}_+(v,j)}{|S|} \right| = O(\delta^{1/3})$. The same is true for the other 'deg functions'.



Proof. By the guarantee of Lemma 3, for all $v \in V$, $j \in [k]$,

$\left| \frac{\widehat{\deg}_+(v,j)}{|S|} - \frac{\deg_+^S(v,j)}{|S|} \right| = O(\delta^{1/3})$.

By the guarantee of Lemma 1 we have that for all $v \in V$, $j \in [k]$,
$\left| \frac{\deg_+^S(v,j)}{|S|} - \frac{\deg_+(v,j)}{n} \right| \le O(\delta^{1/3})$. The claim follows by union bounding and
using the triangle inequality. This concludes the lemma's proof.
Theorem 4 is now an easy corollary.



4.2 The PTAS 

In this section we utilize the approximations to the costs of the vertices
achieved in Algorithm 1 to achieve a PTAS for $k$-clustering. We note that
the heart of our contribution is the previous section, and the lemmas and
proofs here follow the lines of [9]. The main algorithm (Algorithm 2) is of
course different since it utilizes the results of the previous section.

Throughout this section we will assume that the optimal clustering C* has a 
cost of where 7 < ci/3^, where the parameter /3 will be taken as C2£/fc^, and 
ci,C2 will be sufficiently small constants so that Theorem |4] is satisfied. 

The remainder of the section is dedicated to proving Theorem 2. We need
some lemmas. In what follows, we assume that the invocation of Algorithm 1
is successful in the sense that the guarantee of Theorem 4 holds. The
following is an immediate corollary of this guarantee.

Lemma 5. Let $v \in V$ be a vertex satisfying $v \in C'_j \cap C_i^*$, where $i \neq j$. Then
$\mathrm{cost}^*(v, j) \le \mathrm{cost}^*(v) + 2\beta n$.
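One way to see why Lemma 5 is immediate is the following chain. It is a sketch under two assumptions: that the guarantee of Theorem 4 is two-sided, and that $v$ was placed in cluster $j$ because $j$ minimizes the estimate $\widehat{\mathrm{cost}}(v, \cdot)$ produced by Algorithm 1. Here $i = j^*(v)$, so that $\mathrm{cost}^*(v, i) = \mathrm{cost}^*(v)$.

```latex
\begin{align*}
\mathrm{cost}^*(v, j)
  &\le \widehat{\mathrm{cost}}(v, j) + \beta n
     && \text{(Theorem 4)} \\
  &\le \widehat{\mathrm{cost}}(v, i) + \beta n
     && \text{($j$ minimizes the estimate)} \\
  &\le \mathrm{cost}^*(v, i) + 2\beta n
     && \text{(Theorem 4)} \\
  &= \mathrm{cost}^*(v) + 2\beta n .
\end{align*}
```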



Algorithm 2 PTAS for $k$-CC (low cost)

Input: A graph $G = (V, E)$, an integer $k > 1$ and a parameter $\varepsilon > 0$. It is assumed that
the optimal fc-CC cost of G is 7n^, where 7 < and /3 = c^ejl? . 
Output: A clustering $\mathcal{C} = \{C_1, \dots, C_k\}$ of $G$.

Run Algorithm 1 with inputs $G$, $k$ and $\beta$. Obtain approximations $\widehat{\mathrm{cost}}(v, j)$ for all $v \in V$
and $j \in [k]$.

Create empty clusters $C'_1, \dots, C'_k$. For all $v \in V$ add $v$ to $C'_i$, where $i =
\arg\min_j \{\widehat{\mathrm{cost}}(v, j)\}$.

Reorder the clusters so that \C'i\ > ... > \Ck\- Let £ € [k] be such that \Ce\ > ^ and 
jCf+il < ^ (if no such integer exists, set £ = k). 

Run the algorithm recursively on the restriction of $G$ to $W = \bigcup_{j > \ell} C'_j$, the integer $k - \ell$
and approximation parameter $\varepsilon(1 - 1/k)$. Denote its output by $C_{\ell+1}, \dots, C_k$.
Output $\mathcal{C} = (C_1 = C'_1, \dots, C_\ell = C'_\ell, C_{\ell+1}, \dots, C_k)$.
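The non-recursive part of Algorithm 2 (greedy assignment by estimated cost, then separating large clusters from the remainder) can be sketched as follows. The function name `assign_and_split` is hypothetical, and the size cut-off is exposed as an explicit `threshold` argument, which is an assumption of this sketch rather than a value taken from the algorithm's statement.

```python
def assign_and_split(est, threshold):
    """est[v][j] is the estimated cost of placing vertex v in cluster j.

    Returns (large_clusters, remainder), where remainder is the set W of
    vertices that would be clustered by the recursive call.
    """
    k = len(est[0])
    clusters = [[] for _ in range(k)]
    for v, row in enumerate(est):
        # Place v in the cluster with the smallest estimated cost.
        j = min(range(k), key=lambda idx: row[idx])
        clusters[j].append(v)
    # Reorder so that cluster sizes are non-increasing.
    clusters.sort(key=len, reverse=True)
    large = [c for c in clusters if len(c) >= threshold]
    small = [c for c in clusters if len(c) < threshold]
    remainder = sorted(v for c in small for v in c)
    return large, remainder

# Three vertices prefer cluster 0 and one prefers cluster 1.
large, rest = assign_and_split([[0, 5], [0, 5], [0, 5], [5, 0]], threshold=2)
print(large, rest)  # prints [[0, 1, 2]] [3]
```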



Define for any $v \in V$, $\widehat{\mathrm{cost}}(v) = \min_{j \in [k]} \widehat{\mathrm{cost}}(v, j)$, where $\widehat{\mathrm{cost}}(v, j)$ is as
defined in Algorithm 2. Define $V_{\mathrm{costly}} = \{v \in V \mid \widehat{\mathrm{cost}}(v) > c_3 n/k^2\}$, where $c_3$ is
some sufficiently small constant. For any $v \in V_{\mathrm{costly}}$, $\mathrm{cost}^*(v) \ge \frac{1}{2} c_3 n/k^2$ due to
the guarantee of Theorem 4 and our choice of $\beta$. Since (twice) the total optimal
cost is bounded from below by that incurred by vertices in $V_{\mathrm{costly}}$,

$|V_{\mathrm{costly}}| \le 2\gamma n^2 / (\tfrac{1}{2} c_3 n/k^2) \le (4\gamma n k^2)/c_3$. (1)

In particular, using a very crude estimate, this means

$|V_{\mathrm{costly}}| \le c_4 n/k$, (2)

where $c_4$ is a constant that can be made sufficiently small by reducing $c_1$ as
necessary. Recall that $C_1, \dots, C_\ell$ are the large clusters found by Algorithm 2.
Notice that since there are $k$ clusters, there must be at least one cluster of size
at least $n/k$, meaning that $\ell \ge 1$.

Lemma 6. For any $j \in [\ell]$, $C_j^* \setminus V_{\mathrm{costly}} = C_j \setminus V_{\mathrm{costly}}$.

The proof is deferred to Appendix B for lack of space. The next lemma states
the existence of a clustering whose large clusters are identical to those found by
our algorithm and which has an almost optimal cost.

Lemma 7. There exists some $k$-clustering of $V$, $\mathcal{D} = (D_1, \dots, D_k)$, such that for
all $j \in [\ell]$, $D_j = C_j$ and $\mathrm{cost}(\mathcal{D}) \le \gamma n^2 (1 + \varepsilon/k)$.

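Proof. Take $\mathcal{D}$ to be the clustering defined as follows. For any $i \in [k]$,
$D_i = (C_i^* \setminus V_{\mathrm{costly}}) \cup (C'_i \cap V_{\mathrm{costly}})$. That is, $\mathcal{D}$ is the result of starting with
the clustering $\mathcal{C}^*$ and moving the vertices of $V_{\mathrm{costly}}$ to the clusters according to
$\mathcal{C}' = \{C'_1, \dots, C'_k\}$.

Denote by $\mathrm{cost}^{\mathcal{D}}(v)$ the cost of a vertex $v$ w.r.t. the partition $\mathcal{D}$. Notice that
the only edges for which the clustering $\mathcal{D}$ pays while the clustering $\mathcal{C}^*$ does
not must be incident to a node in $V_{\mathrm{costly}}$. Hence,

$\mathrm{cost}(\mathcal{D}) - \mathrm{cost}(\mathcal{C}^*) \le \sum_{v \in V_{\mathrm{costly}}} \left( \mathrm{cost}^{\mathcal{D}}(v) - \mathrm{cost}^*(v) \right)$. (3)

Assume $v \in V_{\mathrm{costly}} \cap D_j$ for some $j \in [k]$. Clearly

$|\mathrm{cost}^{\mathcal{D}}(v) - \mathrm{cost}^*(v, j)| \le |V_{\mathrm{costly}}|$, (4)

because the only difference in such a vertex's cost can come from edges
connecting it to other vertices in $V_{\mathrm{costly}}$. Now assume $v \in V_{\mathrm{costly}} \cap C_i^* \cap D_j$ for
$j \neq i$. By construction, $v \in C'_j$. By Lemma 5 this implies $\mathrm{cost}^*(v, j) \le
\mathrm{cost}^*(v) + 2\beta n$. By (4) we conclude

$\mathrm{cost}^{\mathcal{D}}(v) - \mathrm{cost}^*(v) \le
(\mathrm{cost}^*(v, j) - \mathrm{cost}^*(v, i)) + |V_{\mathrm{costly}}| \le 2\beta n + |V_{\mathrm{costly}}|$.

Plugging this into (3) and using (1), we get

$\mathrm{cost}(\mathcal{D}) - \mathrm{cost}(\mathcal{C}^*) \le |V_{\mathrm{costly}}| (2\beta n + |V_{\mathrm{costly}}|) \le \gamma n^2 \left( 8\beta k^2/c_3 + 16\gamma k^4/c_3^2 \right)$.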

The claim follows since 7 < < ^ ■ P ^ Ik ' Wk^ assuming small ci, C2. 

Proof (of Theorem 2). The claim regarding the query complexity is trivial given
Theorem 4. The running time is a result of the recursion formula $T(n, \varepsilon, k) =$
^e-^k^-Uoeik) +T{n,e{l - l/fc),fc- 1) = n^-^k^' ioe{k) _ ^^^^ ^j^^^ ^j^g 

stated running time is doubly exponential in $k$ whereas here it is singly exponential
in $k$. This difference is due to a minor observation that the recursive call should
be with the parameter $\varepsilon(1 - 1/k)$ rather than $\varepsilon/10$. The same minor change
would result in a singly exponential dependence on $k$ in the algorithm given in
[9] as well. Let $W$ be the union of the small clusters. That is, $W = \bigcup_{j=\ell+1}^{k} C_j$.
By Lemma 7, all of the vertex pairs that are not contained in the set $W \times W$
incur a cost in $\mathcal{C}$ identical to that in $\mathcal{D}$. Let $d_1$ be the cost of $\mathcal{D}$ on pairs in
$W \times W$ and let $d_2$ be its cost on the remaining pairs $(V \times V) \setminus (W \times W)$. Since $W$
is clustered recursively, we have that the cost of $\mathcal{C}$ is at most $d_2 + d_1(1 + \varepsilon/k) \le
(d_1 + d_2)(1 + \varepsilon/k) = \mathrm{cost}(\mathcal{D})(1 + \varepsilon/k)$. The statement of the theorem follows.

5 Query Efficient PTAS for High Cost 

We present a query efficient PTAS for the high loss case of MFAST. The query
efficient PTAS for the high loss case of $k$-CC is almost identical and is thus not
presented. We will start by describing a known PTAS ((R1)+(R2)) based on an
approach given by Arora et al. We then show how to add requirement (R3).
The final approach is summarized in Algorithm 3, found in Appendix D.

5.1 (R1) + (R2) using a Known Additive Approximation Algorithm 

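Let $\pi^*$ denote an optimal permutation, and let OPT denote $\mathrm{cost}(\pi^*)$.
In the high cost MFAST explained in Section 3 we assume $\mathrm{OPT} \ge \gamma n^2$,
where $\gamma = O(\varepsilon^2)$. Instead of directly solving MFAST, we solve the bucketed
version. This idea is not new and can be found in e.g. [10]. An $m$-bucket
ordering $\sigma$ of $V$ is a mapping $\sigma : V \to [m]$, where for each $i \in [m]$ the preimage
satisfies: $\lfloor n/m \rfloor \le |\sigma^{-1}(i)| \le \lceil n/m \rceil$. For brevity we say that $u <_\sigma v$ if $\sigma(u) < \sigma(v)$,
and $u \equiv_\sigma v$ if $\sigma(u) = \sigma(v)$. We extend the definition of $\mathrm{cost}(\cdot)$ to bucketed
orders by defining $\mathrm{cost}(\sigma) = \sum_{u <_\sigma v} 1_{(v,u) \in A}$. We will also need to define:

$\mathrm{cost}^{u,v}(\sigma) = 1_{u <_\sigma v} 1_{(v,u) \in A} + 1_{v <_\sigma u} 1_{(u,v) \in A}$ and $\mathrm{cost}^u(\sigma) = \frac{1}{2} \sum_{v \in V} \mathrm{cost}^{u,v}(\sigma)$,

so that $\mathrm{cost}(\sigma) = \sum_{u \in V} \mathrm{cost}^u(\sigma)$. A permutation $\pi$ extends an $m$-bucketed
ordering $\sigma$ if $u \prec_\pi v$ whenever $u <_\sigma v$.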
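The bucketed cost and its per-vertex decomposition can be checked on a small example. The sketch below is illustrative only; the function names and the representation of a bucket ordering as a dictionary from vertices to bucket indices are assumptions of the example.

```python
def bucket_cost(arcs, sigma):
    """cost(sigma): arcs (v, u) of the tournament with sigma[u] < sigma[v]."""
    return sum(1 for (v, u) in arcs if sigma[u] < sigma[v])

def pair_cost(arc_set, sigma, u, v):
    """cost^{u,v}(sigma): 1 if u and v lie in different buckets and the
    arc between them points from the later bucket to the earlier one."""
    forward = sigma[u] < sigma[v] and (v, u) in arc_set
    backward = sigma[v] < sigma[u] and (u, v) in arc_set
    return int(forward) + int(backward)

def vertex_cost(arc_set, sigma, u):
    """cost^u(sigma) = 1/2 * sum over v of cost^{u,v}(sigma)."""
    return 0.5 * sum(pair_cost(arc_set, sigma, u, v) for v in sigma if v != u)

def extends(pi, sigma):
    """True if pi places u before v whenever sigma[u] < sigma[v]."""
    return all(pi[u] < pi[v]
               for u in sigma for v in sigma if sigma[u] < sigma[v])

# A tournament on 4 vertices and a 2-bucket ordering of it.
arcs = {(0, 1), (0, 2), (0, 3), (1, 2), (3, 1), (2, 3)}
sigma = {0: 1, 1: 1, 2: 2, 3: 2}
print(bucket_cost(arcs, sigma))                         # prints 1
print(sum(vertex_cost(arcs, sigma, u) for u in sigma))  # prints 1.0
```

Pairs inside a common bucket contribute nothing, which is why a permutation extending $\sigma$ can differ in cost from $\sigma$ only on such pairs.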

Observation 5 [10] For any $\pi$ extending $\sigma$, $|\mathrm{cost}(\pi) - \mathrm{cost}(\sigma)| = O(n^2/m)$,
hence for the purpose of obtaining a $(1+\varepsilon)$-approximate solution in our case it
suffices to consider $m$-bucketed orderings with $m = O(1/(\varepsilon\gamma))$.

Let $\sigma^*$ denote any $m$-bucketed ordering of $V$ of which $\pi^*$ is an extension, and
such that $\lfloor n/m \rfloor \le |(\sigma^*)^{-1}(i)| \le \lceil n/m \rceil$ for all $i \in [m]$. The following
approach has been taken in [5]. Let $S = (v_1, v_2, \dots, v_s)$ be a random series of
$s = O(\log n / (\varepsilon\gamma)^2)$ vertices in $V$, each element chosen uniformly and
independently, with repetitions. Abusing notation, we will also think of $S$ as the
set $\{v_1, \dots, v_s\}$. For each $m$-bucketed ordering $\sigma$ and for each $u \in V$, we

make the following definitions: $\mathrm{cost}^{u,S}(\sigma) = \frac{n}{2s} \sum_{i=1}^{s} \mathrm{cost}^{u,v_i}(\sigma)$ and $\mathrm{cost}^S(\sigma) =
\sum_{u \in V} \mathrm{cost}^{u,S}(\sigma)$. Clearly $\mathrm{cost}^S(\sigma)$ is an unbiased estimator of $\mathrm{cost}(\sigma)$ over the
choice of the sample $S$. The top level of our algorithm will enumerate over
all $n^{\Theta(\log(1/(\varepsilon\gamma))/(\varepsilon\gamma)^2)}$ possibilities for the value of $(\sigma^*(v_1), \dots, \sigma^*(v_s))$. From
now on, we will assume the correct possibility has been chosen, so that $\sigma^*(v)$
is "known" for $v \in S$. A verification step will be used to identify the correct
possibility in the end (see Algorithm 3 in Appendix D).
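The estimator $\mathrm{cost}^S$ can be sketched as below. The normalization $n/(2s)$ is the one that makes the estimator unbiased for a uniform sample with repetitions; the function names and the dictionary representation of $\sigma$ are assumptions of this sketch.

```python
def pair_cost(arc_set, sigma, u, v):
    """cost^{u,v}(sigma) as in the text."""
    forward = sigma[u] < sigma[v] and (v, u) in arc_set
    backward = sigma[v] < sigma[u] and (u, v) in arc_set
    return int(forward) + int(backward)

def sampled_vertex_cost(arc_set, sigma, u, sample):
    """cost^{u,S}(sigma) = n/(2s) * sum_i cost^{u,v_i}(sigma)."""
    n, s = len(sigma), len(sample)
    return n / (2 * s) * sum(pair_cost(arc_set, sigma, u, v) for v in sample)

def sampled_cost(arc_set, sigma, sample):
    """cost^S(sigma) = sum over u of cost^{u,S}(sigma)."""
    return sum(sampled_vertex_cost(arc_set, sigma, u, sample) for u in sigma)

arcs = {(0, 1), (0, 2), (0, 3), (1, 2), (3, 1), (2, 3)}
sigma = {0: 1, 1: 1, 2: 2, 3: 2}
# Averaging the estimator over every possible one-element sample
# recovers the exact bucketed cost, illustrating unbiasedness.
average = sum(sampled_cost(arcs, sigma, [v]) for v in sigma) / len(sigma)
print(average)  # prints 1.0
```

Note that the same sample $S$ is shared by all vertices $u$, which is what makes the values $\sigma^*(v_1), \dots, \sigma^*(v_s)$ enumerable.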

Definition 3. For an $m$-bucket ordering $\sigma$, a vertex $u \in V$ and integer $i \in [m]$,
let $\sigma_{u \to i}$ denote the bucket order defined by leaving the value of $\sigma(v)$ unchanged
for $v \neq u$ and mapping $u$ to $i$. More precisely: $\sigma_{u \to i}(v) = \sigma(v)$ if $v \neq u$, and
$\sigma_{u \to i}(u) = i$.

Note that $\sigma_{u \to i}$ may not be exactly an $m$-bucket ordering. To be precise, we
will say that $\sigma_{u \to i}$ is an $m$-bucket* ordering whenever $\sigma$ is an $m$-bucket ordering,
for every $u \in V$ and $i \in [m]$. Clearly, Observation 5 holds for $m$-bucket* orderings
as well, with a possibly different constant hiding in the $O$-notation. The following
lemma is proven using standard measure concentration inequalities:

Lemma 8. Fix an m-bucket ordering a ofV. With probability at least 1 ^n^^^, 
for all $u \in V$ and $i \in [m]$:

$|\mathrm{cost}^{u,S}(\sigma_{u \to i}) - \mathrm{cost}^u(\sigma_{u \to i})| = O(\varepsilon\gamma n)$. (5)
By summing (5) over all $(u, i)$ such that $i = \sigma(u)$, we get

Corollary 1. For any m-bucket order a with probability at least 1 — n~^^, 
$|\mathrm{cost}^S(\sigma) - \mathrm{cost}(\sigma)| = O(\varepsilon\gamma n^2)$.

Arora et al.'s LP approach [5]. The benefit of Lemma 8 is the fact that (5)
can be written as a pair of linear inequalities in variables $(x_{vj})_{v \in V \setminus \{u\}, j \in [m]}$,
where $x_{vj}$ is an indicator for the predicate "$\sigma(v) = j$". Indeed, $\mathrm{cost}^{u,S}(\sigma_{u \to i})$ is a
known constant, and $\mathrm{cost}^u(\sigma_{u \to i})$ is a linear combination of $(x_{vj})_{v \in V \setminus \{u\}, j \in [m]}$. This
property allowed Arora et al. in [5] to introduce an LP over these variables,
where the utility function $\mathrm{cost}^S(\sigma)$ is clearly a linear function of the system
$(x_{vj})_{v \in V, j \in [m]}$. Some obvious standard constraints are added: for all $v, j$, $x_{vj} \ge 0$
and for all $v$, $\sum_{j \in [m]} x_{vj} = 1$, and of course $x_{vj}$ is hardwired as 1 (resp. 0)
whenever $v \in S$ and $\sigma^*(v) = j$ (resp. $\sigma^*(v) \neq j$). The almost balanced bucket
constraint is also added: $\forall j \in [m] : \lfloor n/m \rfloor \le \sum_{v \in V} x_{vj} \le \lceil n/m \rceil$. The following
arguments in [5] are by now classic: randomly round the optimal LP solution by
independently drawing, for each $v \in V$, from the discrete distribution assigning
probability $x_{vj}$ to the $j$'th bucket. Denote the resulting $m$-bucket order $\sigma'$. As
argued in [5], with high probability each constraint in the system will be satisfied
up to a possible additive violation of magnitude depending on an $\ell_\infty$ and an $\ell_0$
(support size) property of the constraint. The precise statement is as follows:

Lemma 9 (Essentially [5]). If the optimal solution to the LP $x^*$ satisfies
$\sum \beta_{vj} x^*_{vj} \le \alpha$ for $\beta \in \mathbb{R}^{|V| \times m}$ and $\alpha \in \mathbb{R}$, then with probability at least $1 - \eta$ the

rounded solution $\sigma'$ will violate the constraint by no more than $\|\beta\|_\infty \sqrt{\|\beta\|_0 \log(1/\eta)}$,
where $\|\beta\|_0$ is the number of vertices $v \in V$ such that $\beta_{vi} \neq 0$ for some $i \in [m]$,
$\|\beta\|_\infty = \max_{v \in V, j \in [m]} |\beta_{vj}|$ and $\eta > 0$ is any number.
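The randomized rounding step described before Lemma 9 can be sketched as follows. The fractional solution is assumed to be given as a dictionary mapping each vertex to its row of $m$ nonnegative weights summing to 1; solving the LP itself is outside the scope of this sketch, and the name `round_lp_solution` is hypothetical.

```python
import random

def round_lp_solution(x, rng):
    """Independently draw a bucket in {1, ..., m} for each vertex v,
    choosing bucket j with probability x[v][j - 1]."""
    m = len(next(iter(x.values())))
    buckets = range(1, m + 1)
    return {v: rng.choices(buckets, weights=row, k=1)[0]
            for v, row in x.items()}

# An integral LP solution is returned unchanged by the rounding.
x = {0: [1.0, 0.0, 0.0], 1: [0.0, 0.0, 1.0], 2: [0.0, 1.0, 0.0]}
print(round_lp_solution(x, random.Random(0)))  # prints {0: 1, 1: 3, 2: 2}
```

Because the draws are independent across vertices, each linear constraint becomes a sum of independent bounded terms after rounding, which is the setting in which the deviation bound of Lemma 9 applies.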

In our case, consider an LP constraint coming from (5). Its corresponding
coefficient vector $\beta$ satisfies $\|\beta\|_0 \le n$ and $\|\beta\|_\infty \le 1$. We conclude that with
probability at least 1 — n"^", ([5]) is satisfied with a — a' and for all and 
hence also the guarantee of Corollary Also, in virtue of the almost balanced 
bucket constraint and Lemma [9l with probability at least 1 — n~^^ the rounded 
solution a' is an m-bucket order. Additionally, by analyzing the coefficient vector 
/3utiiity corresponding to the LP utility function, with probability at least 1 — 
n~^° the cost cost'^(tT') is bounded by LP(a;*) + 0{e^n^), which is bounded by 
cost'^((T*)-|-0(e7n^) by LP optimality. Note also that the guarantee of Corollary[T] 
also applies to a = cr* with probability at least 1 — Combining using union 

bound and triangle inequality, one gets that $\mathrm{cost}(\sigma') \le \mathrm{cost}(\sigma^*) + O(\varepsilon\gamma n^2)$. We
conclude the section with the following lemma that is implicit in [5].

Lemma 10. Given the correct bucketing on the vertices of S, one can construct 
a polynomially sized linear program whose rounded solution $\sigma'$ has the property
cost(cr') < cost((T*) + 0{'-fen'^) with probability at least 1 — . 

5.2 Query efficiency 

The problem is that expressing inequality (5) in the LP requires complete
knowledge of the input $T$. If we take a revised look at this strategy, we
see that the sample $S$ is not strong enough, in the sense that it can be
used to well approximate $\mathrm{cost}(\sigma)$ (per Corollary 1) for no more than
$\mathrm{poly}(n)$ $m$-bucket orders $\sigma$ simultaneously, but certainly not for all
$m$-bucket orders.

For each $u \in V$ randomly select a sample $S^u = (v_1^u, \dots, v_p^u)$ of vertices of $V$,
where p = 0((7e)~^ \ogn), each sample S*" is chosen independently of the other 

^ We have implicitly viewed $\sigma'$ as a vector $(\sigma'_{vj})_{v \in V, j \in [m]}$, with $\sigma'_{vj}$ an indicator for
$\sigma'(v) = j$.

^ All this happens with possibly slightly worse constants hiding in the $O$-notation.



samples, and the $v_i^u$'s are chosen uniformly at random from $V$, with repetitions.
Denote the ensemble $\{S^u : u \in V\}$ by $\mathcal{S}$. For any $m$-balanced ordering $\pi$ on
$V$, define $\mathrm{cost}^{u,\mathcal{S}}(\pi) = \frac{n}{2p} \sum_{i=1}^{p} \mathrm{cost}^{u,v_i^u}(\pi)$ and $\mathrm{cost}^{\mathcal{S}}(\pi) = \sum_{u \in V} \mathrm{cost}^{u,\mathcal{S}}(\pi)$. It
is not hard to see that $\mathrm{cost}^{\mathcal{S}}(\pi)$ is an unbiased estimator of $\mathrm{cost}(\pi)$, for any $\pi$.
Using standard measure concentration bounds, we have the following:

Lemma 11. With probability at least 1 — n^^'^, uniformly for all m-balanced 
orderings $\sigma$ on $V$, $|\mathrm{cost}^{\mathcal{S}}(\sigma) - \mathrm{cost}(\sigma)| = O(\varepsilon\gamma n^2)$.
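The second sample of the "double sampling" scheme, the ensemble of per-vertex samples $S^u$ introduced before Lemma 11, can be sketched as follows. The arc oracle `query_arc(a, b)`, assumed to return True exactly when $(a, b) \in A$, and the function names are hypothetical. The point of the sketch is that only pairs $(u, v)$ with $v \in S^u$ are ever queried, so at most $n \cdot p$ arcs of the tournament are read.

```python
import random

def draw_ensemble(vertices, p, rng):
    """One independent sample S^u of p vertices (with repetitions) per u."""
    verts = list(vertices)
    return {u: [rng.choice(verts) for _ in range(p)] for u in verts}

def ensemble_cost(query_arc, sigma, ensemble):
    """Sum over u of n/(2p) * sum_i cost^{u, v_i^u}(sigma)."""
    n = len(sigma)
    total = 0.0
    for u, sample in ensemble.items():
        hits = 0
        for v in sample:
            if sigma[u] == sigma[v]:
                continue  # same bucket (or v == u): no cost, no query
            first, second = (u, v) if sigma[u] < sigma[v] else (v, u)
            # Charged iff the arc goes from the later bucket to the earlier.
            if query_arc(second, first):
                hits += 1
        total += n / (2 * len(sample)) * hits
    return total

arcs = {(0, 1), (0, 2), (0, 3), (1, 2), (3, 1), (2, 3)}
sigma = {0: 1, 1: 1, 2: 2, 3: 2}
ensemble = draw_ensemble(sigma, 3, random.Random(0))
estimate = ensemble_cost(lambda a, b: (a, b) in arcs, sigma, ensemble)
```

Unlike the shared sample $S$, the ensemble is too large to enumerate over, so in the scheme described here it plays the role of the verification sample.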

Lemma 12. Fix an m-balanced ordering a. With probability at least 1 — n~^^ , 
uniformly for all $u \in V$ and $i \in [m]$,

$|\mathrm{cost}^{u,S}(\sigma_{u \to i}) - \mathrm{cost}^{u,\mathcal{S}}(\sigma_{u \to i})| = O(\varepsilon\gamma n)$. (6)

By summing (6) over all $(u, i)$ s.t. $i = \sigma(u)$, we get

Corollary 2. Fix an m-balanced ordering a. With probability at least 1 — 
$|\mathrm{cost}^S(\sigma) - \mathrm{cost}^{\mathcal{S}}(\sigma)| = O(\varepsilon\gamma n^2)$.

We build an LP as in Section 5.1, except that (6) replaces (5). Note that
the coefficient vectors $\beta$ of the new constraints now satisfy $\|\beta\|_0 = O(p) =
O((\gamma\varepsilon)^{-2} \log n)$ and $\|\beta\|_\infty = O(n/p) = O(n\gamma^2\varepsilon^2 / \log n)$. Using LemmaOand a
similar analysis as in Section 5.1, an analog of Lemma 10 can be proven. That is,
we conclude that with probability at least 1 - n~^, the $m$-bucketed ordering $\sigma'$
output by rounding the optimal LP solution satisfies $\mathrm{cost}^{\mathcal{S}}(\sigma') \le \mathrm{cost}^{\mathcal{S}}(\sigma^*) +
O(\varepsilon\gamma n^2)$. By Lemma 11, this implies that $\mathrm{cost}(\sigma') \le \mathrm{cost}(\sigma^*) + O(\varepsilon\gamma n^2)$.
Algorithm 3 (Appendix D) summarizes the query efficient PTAS for the MFAST high
cost case. The $k$-CC high cost case can be solved along similar lines, though this
case is slightly easier because the clusters need not be balanced.

6 Discussion and Future Work 

We believe that in the low cost $k$-CC case, there should be a PTAS with efficient
query complexity, running in time $\mathrm{poly}(n, \varepsilon^{-1}, k)$ (not exponential in $k, \varepsilon^{-1}$),
assuming the low cost case in each recursive instance. This is true for MFAST,
and we leave the question of achieving it for $k$-CC to future work.

A Proof of Lemma 1



This is a simple application of the following more general, well known sampling
principle: If $V_1, \dots, V_M$ is a collection of subsets of $V$ and $T$ is a sample of $N$
uniformly chosen elements from $V$ (with repetition), then with probability at least $1 - \eta$,

$$\left| \frac{|V_i|}{|V|} - \frac{|V_i \cap T|}{|T|} \right| = O\left(\sqrt{N^{-1} \log(M/\eta)}\right) \quad \text{for all } i = 1, \dots, M.$$
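
The principle can be checked numerically. The sketch below is an illustration only: the helper names are hypothetical, and the explicit radius $\sqrt{\ln(2M/\eta)/(2N)}$ is the one obtained from Hoeffding's inequality with a union bound over the $M$ subsets, which is one standard way to instantiate the $O(\cdot)$ above.

```python
import math
import random


def max_fraction_deviation(universe_size, subsets, sample):
    """Largest gap between a subset's true density and its density in the sample."""
    worst = 0.0
    for subset in subsets:
        true_fraction = len(subset) / universe_size
        hits = sum(1 for x in sample if x in subset)  # repetitions are counted
        worst = max(worst, abs(true_fraction - hits / len(sample)))
    return worst


def hoeffding_radius(num_subsets, sample_size, eta):
    """Deviation radius from Hoeffding's inequality plus a union bound."""
    return math.sqrt(math.log(2 * num_subsets / eta) / (2 * sample_size))


def simulate(universe_size, subsets, sample_size, rng):
    """Draw a uniform sample with repetitions and report the worst deviation."""
    sample = [rng.randrange(universe_size) for _ in range(sample_size)]
    return max_fraction_deviation(universe_size, subsets, sample)
```

For instance, `simulate(1000, subsets, 2000, random.Random(1))` can be compared against `hoeffding_radius(len(subsets), 2000, eta)` for a chosen failure probability `eta`.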



B Proof of Lemma 6

We start by proving the inclusion

$C_j \subseteq C_j^* \cup V_{\mathrm{costly}}$ . (7)

Assume for contradiction that there exists some $v \in C_j \setminus (C_j^* \cup V_{\mathrm{costly}})$. Let
$i \in [k]$, $i \ne j$, be such that $v \in C_i^*$. As $v \notin V_{\mathrm{costly}}$, we know by Theorem 4 that
cost*(w, i) = cost*(w) < cost*(w,j) < cost(w,j) + /3n < ^ + [in. We get: 

^ + 2n/3 > cost* {v,i)+ cost* {v,j) > \C*\ + |C;| - 1 , (8) 

where the right hand inequality is a consequence of the fact that for any $u \in (C_i^* \cup
C_j^*) \setminus \{v\}$, $v$ incurs a price w.r.t. $u$ either if it is included in $C_i^*$ or if it is included
in $C_j^*$. Hence,

max{|C;|, |q|} < ^ + 2n/3 + 1 < 1^ (9) 

for some constant $c_5$ that can be made arbitrarily small by tuning $c_2$ (the constant
product in $\beta$) and $c_3$. Since this holds for any $i \ne j$ satisfying $C_i^* \cap (C_j \setminus
V_{\mathrm{costly}}) \ne \emptyset$, we have that

\Cj I > \Cj\ - IFcostiyl - 2^ I I - 2fc ~ ~fc T ~r ' 

i: i/j,C*n(Cj\Vco,tly)/0 



where we used ^ and and ensure that 2c5 + C4 < 1/2. We derive a contra- 
diction to ©. 

We now prove the inclusion $C_j^* \subseteq C_j \cup V_{\mathrm{costly}}$. Notice that by Q, we
conclude that \C*\ is lower bounded by ^ — C4) > as long as C4 < 1/4. 
Assume for the sake of contradiction that there exists some $v \in C_j^* \setminus V_{\mathrm{costly}}$
such that for some $i \ne j$, $v \in C_i$. By the guarantee of Theorem 4 we have that
cost* {v, j) < cost* {v, i) < cost(t), i) + (3n < csn/fc^ + /3n. This gives us again (|8]), 
leading to a contradiction with our lower bound on $|C_j^*|$ for sufficiently small $c_5$.
This concludes the proof of the lemma.

C Continuation of Proof of Lemma 3

We now proceed with the proof of the lemma. 

Consider the following bipartite directed graph $H = (U, F)$. The vertex set $U$
is defined as follows: $U = \{C : C \text{ is a cluster in } S\} \cup \{D : D \text{ is a cluster in } S\}$.
The edge set $F$ is defined as follows: For any cluster $C \in S$, add a directed
edge $(C, D)$, where $D$ maximizes $|D' \cap C|$ over clusters $D'$ of $S$, breaking ties
arbitrarily. Symmetrically, for each cluster $D$ of $S$ add a directed edge $(D, C)$, where
$C$ maximizes $|C' \cap D|$ over clusters $C'$ of $S$, breaking ties arbitrarily. Note that
the out-degree of all vertices of $H$ is exactly 1. We will now define a bipartite
matching on $U$ using the following rule: If for some $C, D$, both $(C, D) \in F$ and
$(D, C) \in F$, then match $C$ to $D$, and call the pair $(C, D)$ a good match. The
remaining (unmatched) vertices are matched arbitrarily, and the corresponding
pairs are called bad matches.
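
The construction above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, clusters are represented as Python sets, ties are broken towards the lowest index (one admissible arbitrary rule), and both clusterings are assumed to have the same number of clusters so that the leftover clusters can be paired off.

```python
def best_partner(cluster, others):
    """Index of the cluster in `others` with the largest overlap (ties: lowest index)."""
    return max(range(len(others)), key=lambda j: (len(cluster & others[j]), -j))


def match_clusters(first, second):
    """Return (good, bad): lists of index pairs (i, j) matching first[i] to second[j]."""
    # Out-edges of the bipartite directed graph: exactly one per cluster.
    out_first = [best_partner(c, second) for c in first]
    out_second = [best_partner(d, first) for d in second]

    # A good match is a pair of clusters that point at each other.
    good = [(i, j) for i, j in enumerate(out_first) if out_second[j] == i]

    used_first = {i for i, _ in good}
    used_second = {j for _, j in good}
    rest_first = [i for i in range(len(first)) if i not in used_first]
    rest_second = [j for j in range(len(second)) if j not in used_second]

    # The remaining clusters are paired arbitrarily; these are the bad matches.
    bad = list(zip(rest_first, rest_second))
    return good, bad
```

For example, with `first = [{0, 1, 2, 3, 4, 5}, {6}, {7}]` and `second = [{0, 1, 2}, {3, 4, 5, 6}, {7}]`, the cluster `{6}` points at `{3, 4, 5, 6}`, which in turn points at the first cluster of `first`, so that pair ends up as a bad match.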

By the above claim, if (C, D) is a good match, then max{|C \D\,\D\C\} = 
Oh'/'\S\). 

We now show that if (C, D) is a bad match then both |C| = 0{6^/^\S\) and
\D\ = 0{6^/^\S\). By symmetry, it suffices to show that |C| = 0{S^^^\S\). Let
$C$ be a cluster of $S$ that is a member of a bad match. Let $D$ be the unique
cluster of $S$ such that $(C, D) \in F$ and let $C'$ be the unique cluster of $S$ such that
$(D, C') \in F$. By the definition of a bad match we know that $C \ne C'$. By the
above claim we have that both \C \ D\ ^ 0{5^/^\S\) and \D \ C'\ = 0{S^/^\S\), 
which implies \C\C'\ = 0{6^^^\S\). But CnC" = 0, therefore \C\ = 0{5^^^\S\). 
This concludes the proof of the lemma. 



D Algorithm for High-Cost MFAST 



Algorithm 3 query efficient PTAS for MFAST

Input: A graph $T = (V, A)$, approximation parameter $\varepsilon$ and assumed minimal cost
parameter $\gamma$

Output: a permutation $\sigma : V \to [n]$ where $n = |V|$

For each $u \in V$ randomly select a sample $S^u = \{v^u_1, \dots, v^u_p\}$, where
$p = O((\varepsilon\gamma)^{-2} \log(n))$ and the $v^u_i$'s are chosen from $V$ with repetitions.
Denote the ensemble $\{S^u : u \in V\}$ by $\mathcal{S}$. (This is the verification sample.)

Set $S$ as a set of random i.i.d. vertices $S = \{v_1, \dots, v_s\}$ chosen with repetitions, where
s = O((εγ)~^ log(n)). (This is the enumeration sample.)

Set m = O((γε)~^) as the number of buckets.

For each possible $m$-bucket order of the vertices of $S$ perform the following:

- Construct an LP as described in Section 5.2, producing a fractional $m$-bucket order
that agrees with the bucketing of $S$.

- Solve the LP and round it as described in Section 5.1.

Pick the rounded solution whose approximated cost w.r.t. $\mathcal{S}$ ($\mathrm{cost}^{\mathcal{S}}(\cdot)$) is minimal, and
output an arbitrary permutation extending it.
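
The final step of the algorithm can be sketched as follows. This is only an illustration of the selection and extension step, under the assumption that a bucket order is stored as a list mapping each vertex to its bucket index; the function names are hypothetical, and the LP construction and rounding of Sections 5.1 and 5.2 are not reproduced here.

```python
def pick_best(candidates, approx_cost):
    """Select the candidate bucket order with the smallest approximate cost."""
    return min(candidates, key=approx_cost)


def extend_to_permutation(bucket_of):
    """Turn a bucket order into ranks 1..n; ties inside a bucket are broken by vertex id."""
    order = sorted(range(len(bucket_of)), key=lambda v: (bucket_of[v], v))
    rank = [0] * len(bucket_of)
    for position, v in enumerate(order, start=1):
        rank[v] = position
    return rank
```

Breaking ties by vertex id is one arbitrary choice; any other rule for ordering vertices within a bucket would also yield a permutation extending the bucket order.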



